# Adversarial learning
**Vits Vctk** — kakao-enterprise · MIT license
VITS is an end-to-end speech synthesis model that predicts a speech waveform directly from an input text sequence. It uses a conditional variational autoencoder (VAE) architecture comprising a posterior encoder, a decoder, and a conditional prior module.
Tags: Speech Synthesis · Transformers
3,601 · 13
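The card above describes VITS as a conditional VAE with a posterior encoder, a decoder, and a conditional prior. A minimal numpy sketch of that latent mechanics is below; this is not the VITS implementation, and all function names and shapes here are hypothetical. The posterior encoder's output is sampled via the reparameterization trick, and a Gaussian KL term ties the posterior to the text-conditional prior.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mean, log_var):
    # Sample z = mean + sigma * eps: a differentiable draw from the
    # posterior Gaussian (the standard VAE reparameterization trick).
    eps = rng.standard_normal(mean.shape)
    return mean + np.exp(0.5 * log_var) * eps

def kl_divergence(q_mean, q_log_var, p_mean, p_log_var):
    # KL(q || p) between two diagonal Gaussians, summed over latent dims.
    # In a conditional VAE like VITS, q comes from the posterior encoder
    # (audio side) and p from the text-conditional prior, so minimizing
    # this term aligns the latent space with the input text.
    var_q = np.exp(q_log_var)
    var_p = np.exp(p_log_var)
    kl = 0.5 * (p_log_var - q_log_var
                + (var_q + (q_mean - p_mean) ** 2) / var_p
                - 1.0)
    return kl.sum(axis=-1)

# Toy shapes: a batch of 2 frames with a 4-dimensional latent.
q_mean = rng.standard_normal((2, 4))
q_log_var = rng.standard_normal((2, 4))
z = reparameterize(q_mean, q_log_var)  # latent passed to the decoder
kl = kl_divergence(q_mean, q_log_var,
                   np.zeros((2, 4)), np.zeros((2, 4)))
print(z.shape, kl.shape)
```

In the real model the decoder maps `z` to a waveform and the loss adds reconstruction and adversarial terms; this sketch only shows the sampling and KL pieces named in the description.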
**Vits2 Ru Natasha** — frappuccino · MIT license
A Russian text-to-speech model based on the VITS2 architecture, trained on the Natasha dataset, providing efficient and natural speech synthesis.
Tags: Speech Synthesis · Transformers, Other
53 · 7
# Featured Recommended AI Models